ROC-1: Hardware Support for Recovery-Oriented Computing

Authors

  • David L. Oppenheimer
  • Aaron B. Brown
  • James Beck
  • Daniel Hettena
  • Jon Kuroda
  • Noah Treuhaft
  • David A. Patterson
  • Katherine A. Yelick
Abstract

We introduce the ROC-1 hardware platform, a large-scale cluster system designed to provide high availability for Internet service applications. The ROC-1 prototype embodies our philosophy of Recovery-Oriented Computing (ROC) by emphasizing detection and recovery from the failures that inevitably occur in Internet service environments, rather than simple avoidance of such failures. ROC-1 promises greater availability than existing server systems by incorporating four techniques applied from the ground up to both hardware and software: redundancy and isolation, online self-testing and verification, support for problem diagnosis, and concern for human interaction with the system.

Similar resources

Embracing Failure: A Case for Recovery-Oriented Computing (ROC)

Motivated by the lack of availability demonstrated by current approaches to building servers for the Internet environment, we argue for a new approach to building highly-available systems that better reflects the realities of the modern server environment, namely that failures of hardware, software, and humans are inevitable. Our approach, denoted recovery-oriented computing (ROC), recognizes t...

Recovery-Oriented Computing: Main Techniques of Building Multitier Dependability

Frequent freezes and crashes on current systems place a tremendously heavy load on system administration, directly resulting in an undesirable increase in the total cost of ownership (TCO). It is time to broaden the long-standing, performance-dominated research focus, which has neglected other aspects of computing such as dependability, availability, and stability. Deeming that softw...

A Simple Way to Estimate the Cost of Downtime

It is time for the systems community of researchers and developers to broaden the agenda beyond performance. The 10000X increase in performance over the last 20 years means that other aspects of computing have risen in relative importance. We see three challenges for the future as [Patterson2002a]: 1. Synergy with humanity: We need to make the technology match human nature, both for users of pe...

The impact of Cloud Computing in the banking industry resources

Today, one of the biggest problems gripping the banking sector is the high cost of implementing advanced technologies and of using hardware efficiently. Cloud computing, the use of shared services over the Internet, plays a large role in developing the banking system without the need for operating expenses such as staffing, equipment, hardware, and software. Reducing the cost of implem...

Journal:
  • IEEE Trans. Computers

Volume 51  Issue

Pages  -

Publication date: 2002